A METHOD AND AN APPARATUS FOR CONTROLLING THE RATE OF A
VIDEO SEQUENCE; A VIDEO ENCODING DEVICE

This invention relates to a method and an apparatus for 
controlling the rate for encoding a video sequence and a 
video encoding device, wherein the available channel 
bandwidth and computational resources are taken into 
account.

Background of Invention 

Rate control plays an important role in the encoding of live 
video over a channel with a limited bandwidth, for example 
over an internet or a wireless network, and has been widely 
studied by many researchers. Existing results on rate 
control as disclosed in [1], [2], [3], [4] are based on the 
assumption that the computational resources are always 
sufficient and hence, the desired encoding frame rate is 
always guaranteed.

However, when a live video is encoded in software in a
multitasking environment, the computational resources of the
Central Processing Unit (CPU) may not always be sufficient
for the encoding process, because they may be taken up by
other processes having a higher priority. In real-time video
coding systems, encoded bits are stored in a buffer before
they are transmitted over the network to a decoder. When
insufficient computational resources are allocated to the
encoding process, the actual encoding frame rate is less
than the desired frame rate, and the number of generated
bits stored in the buffer is too low. As a result, the
available channel bandwidth is wasted. This phenomenon is
especially common when the video encoding process is
implemented on a handheld device with limited computational
capabilities.

Also, most existing rate control methods focus on the case
in which the available channel bandwidth for the
transmission of the video is constant. However, when the
live video is transmitted over a limited-bandwidth channel
such as the Internet or a wireless network, the available
channel bandwidth for the transmission of the video usually
varies over time. When the available bandwidth of the
channel decreases, the number of bits in the buffer
accumulates. When the number of bits in the buffer is too
large, the encoder usually skips some frames to reduce the
buffer delay and to avoid buffer overflow. Frame skipping
produces undesirable motion discontinuity in the video
sequence.

A recent teaching in reference [5] discloses a rate control
method that can adapt the encoding rate to the varying
available bandwidth. The rate control method uses a fluid-
flow model to compute a target bit rate for each frame of
the video sequence. However, the rate control method as
disclosed in [5] does not take into account the available
computational resources. Moreover, the total number of bits
allocated to each Group of Pictures (GOP) is distributed
evenly to each P-frame in the GOP.

Summary of the Invention

It is an object of the invention to provide a rate control
method that is suitable for a live video encoding process
with varying computational resources and varying available
bandwidth.

The object is achieved by a method for controlling the rate
for encoding a video sequence, wherein the video sequence
comprises a plurality of Groups of Pictures (GOPs), wherein
each Group of Pictures comprises at least an I-frame and an
Inter-frame, the method comprising the following steps for
the encoding of each Inter-frame in the Group of Pictures:
determining a desired frame rate based on an available
bandwidth of a channel for transmitting the video sequence
and on available computational resources for the encoding
process; determining a target buffer level based on the
desired frame rate and the position of the Inter-frame with
respect to the I-frame; and determining a target bit rate
based on the target buffer level and the available channel
bandwidth, wherein the target bit rate is used for
controlling the rate for encoding the video sequence.

A GOP of the video sequence is assumed to comprise an I-
frame (an Intra-frame, i.e. a frame which is completely
encoded without performing motion estimation and motion
compensation) and a plurality of P-frames (Predictive-
frames, i.e. frames which are encoded using motion
estimation and motion compensation) or B-frames (Bi-
directional-frames, i.e. frames which are encoded using
motion estimation and motion compensation from two adjacent
Intra-frames) as Inter-frames. The bits are allocated to
the I-frame based on its complexity, and the bits are
allocated to each Inter-frame, preferably to each P-frame,
using the rate control method according to the invention.
Although the rate control method, in particular the
determining of the target buffer level and the corresponding
target bit rate, is performed preferably on the P-frames of
the GOPs, it should however be noted that the rate control
method according to the invention may also be performed on
the B-frames.

When encoding the Inter-frame, preferably the P-frame, a
desired frame rate is first determined based on the 
available channel bandwidth and the available computational 
resources for the encoding process. The desired frame rate 
does not remain constant, but changes adaptively for each 
Inter-frame depending on the available channel bandwidth and 
the available computational resources. 

When the available computational resources are insufficient
to achieve the desired frame rate, the number of encoded
bits accumulated in the encoder buffer is low, resulting in
buffer underflow and wastage of channel bandwidth. A target
buffer level is therefore predefined to prevent buffer
underflow by taking into account the available
computational resources for the encoding process.

The target buffer level defines how the total number of bits
which are allocated to the GOP are to be distributed to each
Inter-frame (preferably P-frame) of the GOP, i.e. the budget
for each Inter-frame. However, there is normally a
difference between the budget of each Inter-frame and the
actual bits used by it. To ensure that each Inter-frame, and
hence each GOP, uses its own budget, the target bit rate for
each Inter-frame is computed. The target bit rate is
computed using a fluid flow model and linear system control
theory, taking into account the target buffer level and
the available channel bandwidth.

The desired frame rate is determined by determining a target 
encoding time interval for the Inter-frame, preferably the 
P-frame, i.e. the time needed for encoding the Inter-frame. 
The target encoding time is inversely proportional to the 
desired frame rate, and is determined based on the available 
bandwidth and also preferably based on an average encoding 
time. The average encoding time interval for encoding the 
Inter-frame is proportional to the computational resources, 
and hence is indicative of the available computational 
resources. The available bandwidth can be estimated using 
the method disclosed in [6]. 

The target encoding time interval for encoding the Inter-
frame is determined using the following equations:

$$
T_{fi}(n) =
\begin{cases}
A_1 \cdot T_{fi}(n-1) & \text{if } B_{mad}(n) > B_1 \cdot TB_{mad}(n), \\
A_2 \cdot T_{fi}(n-1) & \text{if } B_{mad}(n) < B_2 \cdot TB_{mad}(n), \\
T_{fi}(n-1) & \text{otherwise,}
\end{cases}
$$

wherein

T_fi(n) is the target encoding time interval or the target
time needed to encode the Inter-frame,
A_1 is a parameter wherein 0.80 < A_1 < 1.00,
A_2 is a parameter wherein 1.00 < A_2 < 1.10,
B_1 is a parameter wherein 1.00 < B_1 < 2.00,
B_2 is a parameter wherein 0 < B_2 < 1.00,
TB_mad(n) is the average of B_mad(n), and
B_mad(n) is related to the average encoding time interval
T_ave by

$$ B_{mad}(n) = \frac{u(n) \cdot \max\{T_{ave}(n-1),\ T_{fi}(n-1)\}}{MAD(n)} $$

wherein

u(n) is the available channel bandwidth,
T_ave(n-1) is the average encoding time interval for the
Inter-frame, and
MAD(n) is the mean absolute difference between the current
frame and the previous frame.

According to the invention, A_1 is preferably set at 0.9, A_2
is preferably set at 1.05, B_1 is preferably set at 1.5, and
B_2 is preferably set at 0.25.

The value of the target encoding time interval T_fi(n)
obtained is preferably further adjusted using the following
equation:



The target encoding time interval T_fi(n) is inversely
related to the desired frame rate.

The average encoding time interval is determined using
information on an actual encoding time interval for encoding
the Inter-frame, the target encoding time interval, and the
number of skipped frames due to buffer overflow.
The average encoding time interval is determined using the
following equation:

$$ T_{ave}(n) = (1-\chi)\, T_{ave}(n-1) + \chi \cdot \max\left\{ T_c(n),\ \frac{1}{F_r} - RT_{st}(n-1) \right\} $$

wherein

T_ave(n) is the average time interval for encoding the Inter-
frame,
χ is a weighting factor,
T_c(n) is the actual time for encoding the Inter-frame,
F_r is a predefined frame rate, and
RT_st is further defined as

$$ RT_{st}(n) = 0 \quad \text{if } \max\{T_c(n),\ T_{fi}(n)\} < \frac{1}{F_r} - RT_{st}(n-1) \ \text{ or } \ N_{post}(n) > 0, $$

otherwise,

wherein N_post(n) is the number of skipped frames due to
buffer overflow and ⌊a⌋ refers to the largest integer
less than a.

The use of the sliding window based method for computing
T_fi(n) has the advantage of reducing the effect of burst
noise on the overall performance of the whole encoding
process.

This simple method of adjusting the desired frame rate
according to the invention is able to keep the quality of
Inter-frames in a tolerable range under time-varying channel
bandwidth and sudden motion change without obvious
degradation in the perceptual motion smoothness.

The desired frame rate is determined using information on
the average encoding time interval T_ave(n), and hence based
on the available computational resources.

In each GOP, the target buffer level in each frame is
predefined in a manner such that more bits are allocated
to the Inter-frames, preferably P-frames, nearer to the I-
frame of the GOP than to the Inter-frames which are further
away and belong to the same GOP. In this way, Inter-
frames which are near to the I-frame are encoded with a high
quality, and subsequent Inter-frames which are predicted
from these high quality Inter-frames are also of a high
quality. As a result, the prediction gain based on these
Inter-frames is improved.

The target buffer level for the Inter-frame is predefined
and determined using the following equation:

$$ Target(n) = Target(n-1) - \frac{B_c(t_{i,I}) - \delta \cdot B_s}{N_{gop} - 1} \cdot \sum_{j=0}^{S_c} W_{pos}(n+j) $$

wherein

Target(n) is the target buffer level,
N_gop is the number of frames in a GOP,
B_s is the buffer size,
B_c is the actual buffer occupancy after the coding of the I-
frame,
S_c is an average number of frames skipped due to
insufficient available computational resources for encoding
the Inter-frame according to the desired frame rate, and
W_pos(l) is the position weight of the l-th Inter-frame which
satisfies

$$ \sum_{l=1}^{N_{gop}-1} W_{pos}(l) = N_{gop} - 1 $$

and

$$ W_{pos}(1) \ge W_{pos}(2) \ge \dots \ge W_{pos}(N_{gop} - 1). $$

The average number of skipped frames due to insufficient
computational resources is determined based on an instant
number of skipped frames \tilde{S}_c(n) due to insufficient
computational resources when the Inter-frame is encoded.
The instant number of skipped frames due to insufficient
computational resources is determined using information on
the actual encoding time interval and the target encoding
time interval. The determination of the instant number of
skipped frames due to insufficient computational resources
can be summarized using the following equations:

$$ \tilde{S}_c(n) = \lfloor TST(n) \cdot F_r \rfloor $$

wherein TST(n) is further defined as

$$ TST(n) = \max\left\{ 0,\ \widetilde{TST}(n-1) + \max\{T_c(n),\ T_{fi}(n)\} - \frac{1}{F_r} \right\} $$

and \widetilde{TST}(n-1) is defined as

$$ \widetilde{TST}(n-1) = TST(n-1) - \frac{\lfloor TST(n-1) \cdot F_r \rfloor}{F_r} $$

wherein

T_c is the actual encoding time interval, and F_r is a
predefined frame rate.

The average number of skipped frames due to insufficient
computational resources is then determined using the
following equation:

$$ S_c(n) = \lfloor (1-\theta)\, S_c(n-1) + \theta \cdot \tilde{S}_c(n) \rfloor $$

wherein θ is a weighting factor.

The advantage of using the average number of frames skipped
S_c instead of an instant number of skipped frames for
computing the target buffer level is that the value of S_c
changes slowly. This slow change of S_c coincides with a
slow adjustment of a quantization parameter Q used for the
encoding process of the video.

It should however be noted that in an alternative embodiment
of the invention, the instant number of skipped frames
\tilde{S}_c(n) can be used instead of the average number of
skipped frames S_c(n) to determine the target buffer level.
In the case when the channel bandwidth is constant, the
complexity of each frame is the same and the desired frame
rate is guaranteed, the target buffer level for the n-th
Inter-frame in the i-th GOP can be simplified to become

$$ Target(n) = Target(n-1) - \frac{B_c(t_{i,I}) - \delta \cdot B_s}{N_{gop} - 1} \cdot W_{pos}(n) $$

As can be seen from the above equation, the target buffer
level of the current Inter-frame is greater than the target
buffer level of the subsequent Inter-frames. In other
words, more bits are allocated to the Inter-frame which is
nearer to the I-frame belonging to the same GOP than the
Inter-frame which is further away from the I-frame, i.e.
from the Intra-frame.

The target bit rate according to a preferred embodiment of
the invention is determined based on the average encoding
time interval, the average number of skipped frames due to
insufficient computational resources, the target buffer
level, the available channel bandwidth and the actual buffer
occupancy. In particular, the target bit rate according to
a preferred embodiment of the invention is determined using
the following equation:

$$ \tilde{f}(n) = \max\left\{ 0,\ u(t_{n,i}) \cdot \max\{T_{ave}(n-1),\ T_{fi}(n)\} + (\gamma - 1)\left(B_c(t_{n,i}) - Target(n)\right) \right\} $$

wherein

\tilde{f}(n) is the target bit rate,
t_{n,i} is the time instant the n-th Inter-frame in the i-th
GOP is coded, and
γ is a constant.

Since the available channel bandwidth u(t_{n,i}) and the
average encoding time interval T_ave(n-1) are used to
determine the target bit rate for the Inter-frame, the bit
rate control method according to the invention is adaptive
to both the available channel bandwidth and the available
computational resources.

The target bit rate for the Inter-frame determined above can 
be further adjusted by a weighted temporal smoothing using 
the following equation: 


/(*) = maxJ-^ ^^ +i/ ^( rt ^i)^ x/ ( n ) + ( 1 ^^) x/ ( n ^ 1 )l 

wherein

f(n) is the smoothed target bit rate,
μ is a weighting control factor constant, and
H_hdr(n) is the amount of bits used for shape information,
motion vector and header of the previous frame.

It should be noted that in an alternative embodiment, the
actual encoding time interval T_c(n) can be used instead of
the average encoding time interval T_ave(n) for determining
the target bit rate. The advantage of using the average
encoding time interval T_ave instead of T_c for the
computation of the target bit rate is that T_ave changes
slowly. This also coincides with the slow adjustment of the
quantization parameter Q for the encoding process of the
video sequence. Also, when the actual frame rate is less
than the predefined frame rate, i.e.



more bits are assigned to each frame. Therefore, the
possibility of buffer underflow is reduced compared to any
existing rate control method, and the utilization of the
channel bandwidth is improved.

Once the target bit rate for each Inter-frame is computed,
the corresponding quantization parameter for the encoding
process can be computed, preferably using the Rate-
Distortion (R-D) method described in [5].

In a post-encoding stage of the rate control method
according to the invention, a sleeping time of the encoding
process is updated using the following equation:



wherein ST_c(n) is the sleeping time of the encoding process.
The starting coding time of the next frame is then given by

$$ SCT(n) = T_c(n) + SCT(n-1) + ST_c(n) $$

wherein SCT(n) is the starting encoding time. The starting
decoding time of the next frame is given by



wherein SDT(n) is the starting decoding time. The starting
decoding time is to be sent to a decoder to provide
information on the time for decoding each frame of the
encoded video sequence.

Three points should be considered when determining the
sleeping time ST_c(n) and the starting decoding time SDT(n):
no frame is to be encoded twice, the time resolution is
1/F_r, and sufficient time should elapse when the buffer is
in danger of overflow.

Other objects, features and advantages according to the
invention will be presented in the following detailed
description of the illustrated embodiments when read in
conjunction with the accompanying drawings.

Brief Description of the Drawings

Figure 1 shows a block diagram of the rate control method
according to a preferred embodiment of the invention.

Figure 2 shows the channel bandwidth used for each frame of
the "weather" and "children" video sequences.

Figure 3 shows the computation time needed to encode each
frame of the "weather" and "children" video sequences using
the preferred embodiment of the invention.

Figure 4 shows the comparison of the PSNR for the "weather"
video sequence.

Figure 5 shows the comparison of the PSNR for the "children"
video sequence.

Figure 6 shows the comparison of the actual buffer occupancy
for the "weather" video sequence.

Figure 7 shows the comparison of the actual buffer occupancy
for the "children" video sequence.

Detailed Description of a preferred embodiment of the
Invention

Fig. 1 shows a block diagram of the rate control method
according to a preferred embodiment of the invention.

The rate control method according to the invention comprises
the following three stages:
the initialization stage,
the pre-encoding stage and
the post-encoding stage.

In step 101, a frame rate F_r is predefined for the encoding
process for a Group of Pictures (GOP). Practical issues
like the parameters/specifications of the encoder and
decoder are to be taken into consideration while choosing a
suitable encoding frame rate at this point. Furthermore, it
is not always known whether the hardware on which the video
encoding process, including the rate control, is implemented
can support the predefined frame rate.

In step 102, the buffer size for the video frames is set
based on latency requirements. Before the encoding of the
I-frame, the buffers are initialized at B_s * δ, wherein B_s
is the buffer size and δ is a parameter defined as
0 ≤ δ ≤ 0.5. The I-frame is then encoded in step 103 using a
predefined initial value of the quantization parameter Q_0.
The encoding of the I-frame in step 103 may be implemented
using any of the methods described in [1], [3], [4], [5].

After the I-frame is encoded, the parameters of a Rate-
Distortion (R-D) model which is subsequently used to
determine a suitable quantization parameter for encoding the
corresponding frames of the video are updated in the post-
encoding stage (step 104). In a further step 105 of the
post-encoding stage, the number of skipped frames due to
buffer overflow N_post(n) is determined, preferably using the
method disclosed in [5].



In step 106, a sleeping time ST_c(n) of the encoding process
after the current frame is determined, wherein the sleeping
time ST_c(n) is used to determine a starting encoding time
SCT(n) for the next frame. The determined starting coding
time SCT(n) is then used to determine the starting decoding
time SDT(n) of the next frame in step 107, wherein the
SDT(n) is transmitted to the decoder.

Once the encoding of the I-frame is completed, the next
frame, which is an Inter-frame, is encoded using the
quantization parameter which was determined in the previous
post-encoding stage.

When the channel bandwidth or the statistics of the video
contents vary with time, the quality of each frame of
the video sequence will vary significantly if the encoding
frame rate is fixed at the predefined frame rate F_r. To
avoid this, a target or desired frame rate is determined in
the pre-encoding stage according to the available channel
bandwidth and any sudden motion change.

An average encoding time interval T_ave(n), or the average
time interval needed for encoding a P-frame, is determined
in step 108. The average encoding time interval T_ave(n) is
then used to determine a target encoding time interval
T_fi(n) in step 109. The target encoding time interval
T_fi(n) is inversely related to the desired frame rate.

The determined desired frame rate is then used to determine
a target buffer level for the P-frame in step 110. In step
111, the target buffer level, the actual buffer occupancy,
the available channel bandwidth, the desired frame rate and
the average encoding time interval T_ave are used to
determine a target bit rate f(n) for the P-frame.



Based on the target bit rate f(n), bits are allocated to the
P-frame in step 112. The corresponding quantization
parameter Q is computed as described in [5] in step 113
using the updated R-D model from step 104. The quantization
parameter Q is used to encode the P-frame in step 114.

When the next frame is a P-frame, the R-D model is updated
again in step 104 of the post-encoding stage and the whole
post-encoding and pre-encoding stage is iterated for
encoding the next P-frame. If the next frame is an I-frame
of a next Group of Pictures (GOP), the encoding process
starts again at step 101 for the encoding of the next I-
frame.



The implementation of the steps 108 to 111 of the pre-
encoding stage and steps 106 and 107 of the post-encoding
stage according to the invention will now be described in
detail.

After the coding of an i-th I-frame, the initial value of the
target buffer level is initialized at

$$ Target(0) = B_c(t_{i,I}) \tag{1} $$

wherein

B_c(t_{i,I}) is the actual buffer occupancy after the coding
of the i-th I-frame, and
t_{i,I} is the time instant that the i-th I-frame is coded.



To determine the target bit rate of each P-frame of the GOP,
the target buffer level for the P-frame needs to be
determined. The first step of determining the target buffer
level is to determine the desired frame rate. This is
achieved by first determining the average encoding time
interval of the P-frame T_ave(n) using the following equation
(step 108):

$$ T_{ave}(n) = (1-\chi)\, T_{ave}(n-1) + \chi \cdot \max\left\{ T_c(n),\ \frac{1}{F_r} - RT_{st}(n-1) \right\} \tag{2} $$

wherein

χ is a weighting factor,
T_c(n) is the actual time for encoding the P-frame, and
RT_st is defined as

$$ RT_{st}(n) = 0 \quad \text{if } \max\{T_c(n),\ T_{fi}(n)\} < \frac{1}{F_r} - RT_{st}(n-1) \ \text{ or } \ N_{post}(n) > 0, \tag{3} $$

otherwise,

wherein ⌊a⌋ refers to the largest integer less than a.

The weighting factor χ satisfies 0 < χ < 1 and is preferably
set to a value of 0.125. The initial value of the average
encoding time interval T_ave(n) is given by

$$ T_{ave}(0) = \frac{1}{F_r} \tag{5} $$

and the initial value of RT_st(n) is given by

$$ RT_{st}(0) = 0 \tag{6} $$
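
The update in equation (2) can be sketched in a few lines of Python. This is an illustration only: the function and argument names are hypothetical, RT_st(n-1) is taken as an input rather than computed, and the sketch assumes that equation (2) reads T_ave(n) = (1 - χ) T_ave(n-1) + χ max{T_c(n), 1/F_r - RT_st(n-1)}.

```python
def update_average_encoding_time(t_ave_prev, t_c, rt_st_prev, frame_rate, chi=0.125):
    """One step of the exponentially weighted average of equation (2).

    t_ave_prev : T_ave(n-1), previous average encoding time interval (seconds)
    t_c        : T_c(n), actual time spent encoding the current P-frame (seconds)
    rt_st_prev : RT_st(n-1), supplied by the caller (not computed here)
    frame_rate : F_r, the predefined frame rate (frames per second)
    chi        : weighting factor, 0 < chi < 1 (0.125 is the preferred value)
    """
    if not 0.0 < chi < 1.0:
        raise ValueError("chi must lie strictly between 0 and 1")
    # The sample fed into the average is never shorter than the part of
    # the nominal frame period 1/F_r that remains after RT_st(n-1).
    sample = max(t_c, 1.0 / frame_rate - rt_st_prev)
    return (1.0 - chi) * t_ave_prev + chi * sample


# Usage: start from T_ave(0) = 1/F_r and feed in a slow encoder (50 ms per frame).
t_ave = 1.0 / 30.0
for _ in range(10):
    t_ave = update_average_encoding_time(t_ave, 0.05, 0.0, 30.0)
```

With a small χ the average moves slowly towards the measured encoding time, so a single slow frame has little effect on the desired frame rate.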



A variable B_mad(n) is further defined by the following
equation:

$$ B_{mad}(n) = \frac{u(n) \cdot \max\{T_{ave}(n-1),\ T_{fi}(n-1)\}}{MAD(n)} \tag{7} $$

wherein

u(n) is the available channel bandwidth, and
MAD(n) is the mean absolute difference between the current
frame and the previous frame.

The available channel bandwidth u(n) can be estimated by the
method described in [6].

An average value of B_mad(n) is then computed using the
following equation:

(8)

wherein

TB_mad(n) is the average value of B_mad(n), and the weighting
factor used in the average is preferably set at a value of
0.125.

After the value of TB_mad(n) is computed, the target encoding
time interval T_fi(n) can be calculated as below (step 109):

$$ T_{fi}(n) = A_1 \cdot T_{fi}(n-1) \quad \text{if } B_{mad}(n) > B_1 \cdot TB_{mad}(n), \tag{9} $$

$$ T_{fi}(n) = A_2 \cdot T_{fi}(n-1) \quad \text{if } B_{mad}(n) < B_2 \cdot TB_{mad}(n), \tag{10} $$

$$ T_{fi}(n) = T_{fi}(n-1) \quad \text{otherwise,} \tag{11} $$

wherein

A_1 is a parameter wherein 0.80 < A_1 < 1.00,
A_2 is a parameter wherein 1.00 < A_2 < 1.10,
B_1 is a parameter wherein 1.00 < B_1 < 2.00, and
B_2 is a parameter wherein 0 < B_2 < 1.00.
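
Equations (7) and (9) to (11) amount to a three-way threshold rule, which the following Python sketch illustrates. The function names are hypothetical, the default parameters are the preferred values stated earlier (A_1 = 0.9, A_2 = 1.05, B_1 = 1.5, B_2 = 0.25), and TB_mad(n) is passed in as an input because the averaging equation (8) is not reproduced here.

```python
def b_mad(u, t_ave_prev, t_fi_prev, mad):
    """Equation (7): bits available per unit of frame difference.

    u          : available channel bandwidth u(n), in bits per second
    t_ave_prev : T_ave(n-1), average encoding time interval (seconds)
    t_fi_prev  : T_fi(n-1), previous target encoding time interval (seconds)
    mad        : MAD(n), mean absolute difference between current and previous frame
    """
    return u * max(t_ave_prev, t_fi_prev) / mad


def update_target_interval(t_fi_prev, b_mad_n, tb_mad_n,
                           a1=0.9, a2=1.05, b1=1.5, b2=0.25):
    """Equations (9)-(11): adapt the target encoding time interval T_fi(n)."""
    if b_mad_n > b1 * tb_mad_n:
        # Plenty of bits for the current motion: shorten the interval,
        # i.e. raise the desired frame rate.
        return a1 * t_fi_prev
    if b_mad_n < b2 * tb_mad_n:
        # Too few bits for the current motion: lengthen the interval,
        # i.e. lower the desired frame rate.
        return a2 * t_fi_prev
    return t_fi_prev


# Usage: the desired frame rate is the inverse of the returned interval.
t_fi = update_target_interval(1.0 / 30.0, b_mad_n=2.0, tb_mad_n=1.0)
desired_frame_rate = 1.0 / t_fi
```

Because A_1 and A_2 are close to 1, each call changes the interval by at most 10 percent, which keeps the frame rate from jumping abruptly.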

The value of the target encoding time interval T_fi(n)
determined from equations (9), (10) or (11) may further be
adjusted using the following equation:

(12)

wherein the initial value of T_fi(n) is given by

$$ T_{fi}(0) = \frac{1}{F_r} \tag{13} $$

After the desired frame rate is determined from the inverse
of the target encoding time interval T_fi(n), the average
number of frames skipped due to insufficient computational
resources S_c(n) is determined in order to determine the
target buffer level.

Two time variables are defined as follows:

$$ \widetilde{TST}(n-1) = TST(n-1) - \frac{\lfloor TST(n-1) \cdot F_r \rfloor}{F_r} \tag{14} $$

$$ TST(n) = \max\left\{ 0,\ \widetilde{TST}(n-1) + \max\{T_c(n),\ T_{fi}(n)\} - \frac{1}{F_r} \right\} \tag{15} $$

wherein the initial value of TST(n) is given by

$$ TST(0) = 0 \tag{16} $$

An instant number of skipped frames \tilde{S}_c(n) due to
insufficient computational resources is then given by

$$ \tilde{S}_c(n) = \lfloor TST(n) \cdot F_r \rfloor \tag{17} $$

and the average number of skipped frames due to insufficient
computational resources S_c(n) is given by

$$ S_c(n) = \lfloor (1-\theta)\, S_c(n-1) + \theta \cdot \tilde{S}_c(n) \rfloor \tag{18} $$

wherein θ satisfies 0 < θ < 1, and is preferably set at a
value of 0.125. The initial value of S_c(n) is given by

$$ S_c(0) = 0 \tag{19} $$
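
The bookkeeping of equations (14) to (18) can be illustrated with the short Python sketch below. The function name and return shape are hypothetical. The sketch also assumes that the carry-over term of equation (14) is the part of TST(n-1) left after removing the whole frame periods already counted as skipped; that form is an assumption made for this illustration.

```python
import math


def skipped_frames_step(tst_prev, s_avg_prev, t_c, t_fi, frame_rate, theta=0.125):
    """Advance the time lag TST and the skipped-frame counts by one P-frame.

    tst_prev   : TST(n-1), accumulated time lag in seconds
    s_avg_prev : S_c(n-1), previous average number of skipped frames
    t_c        : T_c(n), actual encoding time of the current frame (seconds)
    t_fi       : T_fi(n), target encoding time interval (seconds)
    frame_rate : F_r, predefined frame rate (frames per second)
    theta      : weighting factor, 0 < theta < 1 (0.125 is the preferred value)
    Returns (TST(n), instant skipped frames, average skipped frames).
    """
    # (14), assumed form: keep only the fractional frame period of the old lag.
    tst_tilde = tst_prev - math.floor(tst_prev * frame_rate) / frame_rate
    # (15): the lag grows when encoding takes longer than one frame period.
    tst = max(0.0, tst_tilde + max(t_c, t_fi) - 1.0 / frame_rate)
    # (17): whole frame periods contained in the lag are skipped frames.
    s_inst = math.floor(tst * frame_rate)
    # (18): slowly varying average, rounded down to an integer.
    s_avg = math.floor((1.0 - theta) * s_avg_prev + theta * s_inst)
    return tst, s_inst, s_avg


# Usage: a frame that takes 110 ms at F_r = 30 fps, starting from TST(0) = 0.
tst, s_inst, s_avg = skipped_frames_step(0.0, 0, t_c=0.11, t_fi=1.0 / 30.0, frame_rate=30.0)
```

Because of the floor in equation (18), the average stays at zero for isolated slow frames and only rises when frames are skipped persistently, which matches the stated goal of a slowly changing S_c.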



The target buffer level for the P-frame can now be
determined using the following equation (step 110):

$$ Target(n) = Target(n-1) - \frac{B_c(t_{i,I}) - \delta \cdot B_s}{N_{gop} - 1} \cdot \sum_{j=0}^{S_c} W_{pos}(n+j) \tag{20} $$

wherein

Target(n) is the target buffer level,
N_gop is the number of frames in a GOP, and
W_pos(l) is the position weight of the l-th Inter-frame which
satisfies

$$ \sum_{l=1}^{N_{gop}-1} W_{pos}(l) = N_{gop} - 1 $$

and

$$ W_{pos}(1) \ge W_{pos}(2) \ge \dots \ge W_{pos}(N_{gop} - 1). $$
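
To make the role of the position weights concrete, the following Python sketch steps the target buffer level through one GOP. It is a sketch under stated assumptions, not the definitive form of equation (20): it assumes the update Target(n) = Target(n-1) - (B_c(t_{i,I}) - δ B_s)/(N_gop - 1) multiplied by the sum of W_pos(n+j) for j = 0..S_c, with non-increasing weights that sum to N_gop - 1. The function name, the list representation of the weights and the clamping of the sum at the end of the GOP are all hypothetical.

```python
def target_buffer_level(target_prev, b_c_after_i, b_s, delta, n_gop, w_pos, n, s_c=0):
    """One step of the assumed target buffer level update.

    target_prev : Target(n-1)
    b_c_after_i : buffer occupancy right after the I-frame was coded
    b_s         : buffer size
    delta       : initial buffer fullness fraction, 0 <= delta <= 0.5
    n_gop       : number of frames in the GOP
    w_pos       : list with W_pos(1) .. W_pos(n_gop - 1)
    n           : 1-based index of the current Inter-frame
    s_c         : average number of frames skipped for lack of computation
    """
    # Assumption: weights past the last Inter-frame of the GOP are ignored.
    last = min(n + s_c, n_gop - 1)
    weight_sum = sum(w_pos[l - 1] for l in range(n, last + 1))
    per_frame = (b_c_after_i - delta * b_s) / (n_gop - 1)
    return target_prev - per_frame * weight_sum


# Usage: a 5-frame GOP with weights that favour frames close to the I-frame.
weights = [1.6, 1.2, 0.8, 0.4]   # non-increasing, sums to N_gop - 1 = 4
level = 4000.0                   # Target(0) = B_c after the I-frame
levels = []
for n in range(1, 5):
    level = target_buffer_level(level, 4000.0, 8000.0, 0.125, 5, weights, n)
    levels.append(level)
```

Under these assumptions the level falls fastest right after the I-frame and ends the GOP at δ B_s, the same fullness the buffer was initialized with.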



As the R-D model is not exact, there is usually a difference
between the target buffer level for each frame and the
actual buffer occupancy. A target bit rate is thus computed
for each frame to keep the actual buffer occupancy close to
the target buffer level. The target bit rate for each frame
is determined by:

$$ \tilde{f}(n) = \max\left\{ 0,\ u(t_{n,i}) \cdot \max\{T_{ave}(n-1),\ T_{fi}(n)\} + (\gamma - 1)\left(B_c(t_{n,i}) - Target(n)\right) \right\} \tag{21} $$

wherein

\tilde{f}(n) is the target bit rate,
t_{n,i} is the time instant the n-th P-frame in the i-th GOP
is coded, and
γ is a constant which satisfies 0 < γ < 1, and is preferably
set at a value of 0.25.

Since the available channel bandwidth u(t_{n,i}) and the
average coding time interval T_ave(n-1) are used to determine
the target bit rate for each P-frame, the bit rate control
method according to the invention is adaptive to the channel
bandwidth and the computational resources.
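
Equation (21) combines a channel term with a buffer correction term, as the following Python sketch shows. The function and argument names are hypothetical and the numbers in the usage example are made up for illustration; they are not taken from the experiments.

```python
def target_bits(u, t_ave_prev, t_fi, buffer_occupancy, target_level, gamma=0.25):
    """Equation (21): target number of bits for the current P-frame.

    u                : available channel bandwidth u(t_n,i), bits per second
    t_ave_prev       : T_ave(n-1), average encoding time interval (seconds)
    t_fi             : T_fi(n), target encoding time interval (seconds)
    buffer_occupancy : B_c(t_n,i), actual buffer occupancy in bits
    target_level     : Target(n), target buffer level in bits
    gamma            : constant, 0 < gamma < 1 (0.25 is the preferred value)
    """
    # Bits the channel can drain while this frame is being encoded.
    channel_term = u * max(t_ave_prev, t_fi)
    # gamma - 1 is negative: a buffer above its target reduces the budget,
    # a buffer below its target increases it.
    correction = (gamma - 1.0) * (buffer_occupancy - target_level)
    return max(0.0, channel_term + correction)


# Usage: 64 kbit/s channel, slow encoder (50 ms per frame), buffer 1000 bits above target.
bits = target_bits(64000.0, 0.05, 1.0 / 30.0, 5000.0, 4000.0)
```

When the encoder is slower than the desired frame rate, T_ave(n-1) exceeds T_fi(n) and the channel term grows, so each encoded frame receives more bits and the buffer is less likely to underflow.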
Further adjustment to the target bit rate can be made using
the following weighted temporal smoothing equation:

/(/z) = maxJ ^ i +^(n-li/*x/(«)+(l-/i)x/(n-l)L 

(22)
wherein

f(n) is the smoothed target bit rate,
μ is a weighting control factor constant which is set
preferably at a value of 0.5, and
H_hdr(n) is the amount of bits used for shape information,
motion vector and header of the previous frame.

Once the target bit rate is determined, bits are allocated
to each P-frame based on this target bit rate (step 112).
The corresponding quantization parameter Q is also
calculated (step 113) using the method disclosed in [5].
The corresponding quantization parameter Q is then used for
coding the P-frame (step 114).

After the coding of the P-frame is complete, the parameters
of the R-D model are updated and the number of skipped frames
due to buffer overflow is determined in the post-encoding
stage (steps 104 and 105, respectively) using the method
disclosed in [5].

In a further step of the post-encoding stage (step 106), the
sleeping time of the encoding process after the current
frame is determined using the following equation:

(23)

wherein ST_c(n) is the sleeping time of the encoding process.

The starting encoding time of the next frame can then be
obtained using the following equation:

$$ SCT(n) = T_c(n) + SCT(n-1) + ST_c(n) \tag{24} $$

wherein SCT(n) is the starting encoding time. The starting
decoding time for the next frame can then be obtained using
the following equation (step 107):



wherein SDT(n) is the starting decoding time. The SDT(n)
for the next frame is then transmitted to the decoder to
decode the next frame at the time indicated by SDT(n).

It should be noted that in the determination of ST_c(n) and
SDT(n), no frame is encoded twice, the time resolution is
1/F_r, and sufficient time should elapse when the buffer
is in danger of overflow.

To demonstrate that the objective of the rate control method
according to the invention has been met, the rate control
method according to the invention and the rate control
method used in the standard MPEG-4 encoding device are
applied to two video sequences, and their performances are
compared.

The two video sequences are referred to as "weather" and
"children", respectively, and are of QCIF size. The
predefined frame rate, F_r, is 30 fps (frames per second),
and the length of each GOP is 50. The available channel
bandwidth and the computation time used for encoding each
frame of the video sequence are shown in Fig. 2 and Fig. 3,
respectively.



The actual frame rate is above 17 fps, which is less than
the predefined frame rate of 30 fps. The initial buffer
fullness is set at B_s/8 and the initial quantization
parameter Q_0 is set at 15.



Fig. 4 and Fig. 5 show the Peak Signal-to-Noise Ratio (PSNR)
of the "weather" and "children" video sequences,
respectively, using the rate control method according to the
invention and the rate control method used in MPEG-4.

The average PSNR of the "weather" video sequence using the
rate control method according to the invention is 34.16 dB,
whereas the average PSNR of the "weather" video sequence
using the rate control method used in MPEG-4 is 32.6 dB.
Similarly, the average PSNR of the "children" video sequence
using the rate control method according to the invention is
30.51 dB, whereas the average PSNR of the "children" video
sequence using the rate control method used in MPEG-4 is
29.87 dB.

Therefore, it can be seen that the average PSNR of the video
sequences using the rate control method according to the
invention is higher than using the rate control method of
MPEG-4.

Fig. 6 and Fig. 7 show the actual buffer occupancy for the 
"weather" and "children" video sequences using the rate 
control method according to the invention and the rate 
control method used in MPEG-4, respectively. 



As can be seen from Fig. 6 and Fig. 7, buffer underflow
occurs 12 times for the "weather" video sequence and 18
times for the "children" video sequence when the rate
control method of MPEG-4 is used. No buffer underflow occurs
for either video sequence when the rate control method
according to the invention is used.



What is claimed is

1. A method for controlling the rate for encoding a video
sequence, wherein the video sequence comprises a
plurality of Group Of Pictures, wherein each Group of
Pictures comprises at least an I-frame and an Inter-frame,
the method comprising the following steps for the
encoding of each Inter-frame in the Group of Pictures:

• Determining a desired frame rate based on an
available bandwidth of a channel which is used for
transmitting the video sequence and on available
computational resources for the encoding process;

• Determining a target buffer level based on the
desired frame rate and the position of the Inter-frame
with respect to the I-frame; and

• Determining a target bit rate based on the target
buffer level and the available channel bandwidth,
wherein the target bit rate is used for controlling
the rate for encoding the video sequence.

2. The method for rate control according to claim 1, 
comprising the further steps of: 

• Determining a target encoding time interval for the 
Inter-frame; and 

• Determining the desired frame rate based on the
determined target encoding time interval.

3. The method for rate control according to claim 2,
wherein the target encoding time interval for the
Inter-frame is determined based on the available channel
bandwidth and an average encoding time interval used for
encoding the Inter-frame, wherein the average encoding
time interval for the Inter-frame is proportional to the
available computational resources for the encoding
process.



4. The method for rate control according to claim 3,
wherein the target encoding time interval for the
Inter-frame is determined using the following equations:

$$
T_{fi}(n) =
\begin{cases}
A_1 \cdot T_{fi}(n-1) & \text{if } B_{mad}(n) > B_1 \cdot TB_{mad}(n), \\
A_2 \cdot T_{fi}(n-1) & \text{if } B_{mad}(n) < B_2 \cdot TB_{mad}(n), \\
T_{fi}(n-1) & \text{otherwise,}
\end{cases}
$$

wherein

• T_fi(n) is the target encoding time interval for the
Inter-frame,

• A_1 is a parameter wherein 0.80 < A_1 < 1.00,

• A_2 is a parameter wherein 1.00 < A_2 < 1.10,

• B_1 is a parameter wherein 1.00 < B_1 < 2.00,

• B_2 is a parameter wherein 0 < B_2 < 1.00,

• TB_mad(n) is the average of B_mad(n), and

• B_mad(n) is defined as






f(n)= MAD® 



wherein 

• u(n) is the available channel bandwidth,

• T_ave(n-1) is the average encoding time interval for
the Inter-frame, and

• MAD(n) is the mean absolute difference between the
current frame and the previous frame.
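
The three-case update of claim 4 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes the reading T_fi(n) = A_1·T_fi(n-1) when B_mad(n) > B_1·TB_mad(n), T_fi(n) = A_2·T_fi(n-1) when B_mad(n) < B_2·TB_mad(n), and T_fi(n) = T_fi(n-1) otherwise. The function name and the default parameter values are hypothetical; the defaults are merely picked from inside the claimed ranges, and B_mad(n) and TB_mad(n) are taken as inputs rather than computed.

```python
def update_target_interval(t_fi_prev, b_mad, tb_mad,
                           a1=0.9, a2=1.05, b1=1.5, b2=0.5):
    """Return T_fi(n) given T_fi(n-1), B_mad(n) and TB_mad(n).

    a1, a2, b1, b2 are hypothetical values chosen inside the ranges
    0.80 < A_1 < 1.00, 1.00 < A_2 < 1.10, 1.00 < B_1 < 2.00, 0 < B_2 < 1.00.
    """
    if b_mad > b1 * tb_mad:
        # B_mad is well above its average: shorten the interval.
        return a1 * t_fi_prev
    if b_mad < b2 * tb_mad:
        # B_mad is well below its average: lengthen the interval.
        return a2 * t_fi_prev
    # Otherwise keep the previous target encoding time interval.
    return t_fi_prev
```

With these illustrative defaults, a previous interval of 0.04 s shrinks to about 0.036 s when B_mad(n) is twice its average and grows to about 0.042 s when it is below half its average.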

5. The method for rate control according to claim 4, 

wherein the target encoding time interval is further 
adjusted by 



T n (n)=mm\-^ > max\-^—,T fi (n) 

\ 4F r \ 4F r 



6. The method for rate control according to claim 3,
wherein the average encoding time interval for the 
Inter-frame is determined based on an actual encoding 
time interval for the Inter-frame. 

7. The method for rate control according to claim 6,
wherein the average encoding time interval for the 
Inter-frame is further determined based on the target 
encoding time interval and the number of skipped frames 
due to buffer overflow. 



8. The method for rate control according to claim 7,
wherein the average encoding time interval for the 
Inter-frame is determined using the following equation: 



$$
T_{ave}(n) = (1 - \chi)\, T_{ave}(n-1) + \chi \cdot \max\left\{ T_c(n),\ \frac{1}{F_r} - RT_{st}(n-1) \right\}
$$



wherein

• χ is a weighting factor,

• T_c(n) is the actual encoding time,

• F_r is a predefined frame rate, and

• RT_st(n) is further defined as

RT_st(n) = 0 if max{T_c(n), T_fi(n)} < 1/F_r - RT_st(n-1) or N_post(n) > 0,
otherwise,

wherein N_post(n) is the number of skipped frames due to
buffer overflow.
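
The weighted averaging of claim 8 can be sketched as follows. This is a hedged illustration that assumes the equation reads T_ave(n) = (1 - χ)·T_ave(n-1) + χ·max{T_c(n), 1/F_r - RT_st(n-1)}; that reading is reconstructed from a damaged copy and should be checked against the original filing. The function name and the default value of χ are hypothetical, and RT_st(n-1) is taken as an input because its full definition is not reproduced here.

```python
def update_average_interval(t_ave_prev, t_c, rt_st_prev, frame_rate, chi=0.125):
    """Return T_ave(n) under the assumed reading of claim 8.

    t_ave_prev : T_ave(n-1), previous average encoding time interval (s)
    t_c        : T_c(n), actual encoding time of the current frame (s)
    rt_st_prev : RT_st(n-1), supplied by the caller (s)
    frame_rate : F_r, predefined frame rate (frames per second)
    chi        : weighting factor (hypothetical default)
    """
    # The interval counted for this frame is at least the actual encoding time.
    effective = max(t_c, 1.0 / frame_rate - rt_st_prev)
    # Exponentially weighted moving average with weight chi.
    return (1.0 - chi) * t_ave_prev + chi * effective
```

For example, with T_ave(n-1) = 0.05 s, T_c(n) = 0.06 s, RT_st(n-1) = 0 and F_r = 30 fps, the sketch returns about 0.05125 s.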

9. The method for rate control according to claim 5, 
wherein the target buffer level is determined such that 
an Inter-frame which is nearer to the I-frame in the GOP 
has a higher target buffer level compared to another 
Inter-frame which is further from the I-frame belonging 
to the same GOP. 

10. The method for rate control according to claim 9, 
wherein the target buffer level is determined using the 
following equation: 

T arg et(n) = T arg et(n - 1) cV '; 7/ s * y Wpos (n + j) 

gap 

wherein

• Target(n) is the target buffer level,

• N_gop is the number of frames in a GOP,

• B_s is the buffer size,

• B_c is the actual buffer occupancy,

• S̄_c is an average number of skipped frames due to
insufficient available computational resources for
encoding the Inter-frame according to the desired
frame rate, and

• W_pos(l) is the position weight of the l-th Inter-frame
which satisfies

AT -1 



11. The method for rate control according to claim 10, 
wherein the average number of skipped frames due to 
insufficient available computational resources for 
encoding the Inter-frame according to the desired frame 
rate is determined based on an instant number of skipped 
frames due to the insufficient computational resources 
while encoding the Inter-frame. 

12. The method for rate control according to claim 11, 
wherein the instant number of skipped frames due to 
insufficient computational resources is determined based 
on the actual encoding time interval and the target 
encoding time interval. 




and 



W^a)*W„(2)*---*W„(N„ -1) . 



13. The method for rate control according to claim 12,
wherein the instant number of skipped frames is
determined using the following equation:

$$
S_c(n) = \left\lfloor TST(n) \cdot F_r \right\rfloor
$$

wherein TST(n) is further defined as

$$
TST(n) = \max\left\{ 0,\ \overline{TST}(n-1) + \max\{ T_c(n),\ T_{fi}(n) \} - \frac{1}{F_r} \right\}
$$

and $\overline{TST}(n-1)$ is defined as

$$
\overline{TST}(n-1) = TST(n-1) - \frac{\left\lfloor TST(n-1) \cdot F_r \right\rfloor}{F_r}
$$

wherein

• S_c(n) is the instant number of skipped frames due to
insufficient computational resources,

• T_c(n) is the actual encoding time interval, and

• F_r is a predefined frame rate.
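
The bookkeeping of claim 13 can be sketched as follows. The sketch assumes the reading S_c(n) = floor(TST(n)·F_r), where TST(n) = max{0, R + max{T_c(n), T_fi(n)} - 1/F_r} and R = TST(n-1) - floor(TST(n-1)·F_r)/F_r is the part of the previous overrun not yet counted as whole skipped frames. This reading is reconstructed from a damaged copy and is an assumption, as are the function and variable names.

```python
import math


def instant_skipped_frames(tst_prev, t_c, t_fi, frame_rate):
    """Return (S_c(n), TST(n)) under the assumed reading of claim 13.

    tst_prev   : TST(n-1), accumulated overrun after the previous frame (s)
    t_c        : T_c(n), actual encoding time interval (s)
    t_fi       : T_fi(n), target encoding time interval (s)
    frame_rate : F_r, predefined frame rate (frames per second)
    """
    # Remainder of the previous overrun after removing whole frame periods.
    carry = tst_prev - math.floor(tst_prev * frame_rate) / frame_rate
    # Time by which this frame exceeds one nominal frame period, never negative.
    tst = max(0.0, carry + max(t_c, t_fi) - 1.0 / frame_rate)
    # Whole frame periods contained in the overrun are counted as skipped frames.
    return math.floor(tst * frame_rate), tst
```

The second return value is meant to be passed back in as `tst_prev` when the next Inter-frame is processed.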

14. The method for rate control according to claim 13,
wherein the average number of skipped frames due to
insufficient computational resources is determined using
the following equation:

$$
\bar{S}_c(n) = \left\lfloor (1 - \theta)\, \bar{S}_c(n-1) + \theta \cdot S_c(n) \right\rfloor
$$

wherein

• θ is a weighting factor.
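
The averaging step of claim 14 can be sketched as a floored weighted average of the previous average and the instant number of skipped frames. The sketch assumes the equation reads S̄_c(n) = floor((1 - θ)·S̄_c(n-1) + θ·S_c(n)); the function name and the default value of θ are hypothetical.

```python
import math


def update_average_skipped(s_avg_prev, s_inst, theta=0.5):
    """Return the average number of skipped frames for frame n.

    s_avg_prev : average number of skipped frames after frame n-1
    s_inst     : S_c(n), instant number of skipped frames for frame n
    theta      : weighting factor (hypothetical default)
    """
    # Weighted average of history and current sample, rounded down to an integer.
    return math.floor((1.0 - theta) * s_avg_prev + theta * s_inst)
```

With θ = 0.5, a previous average of 2 and an instant count of 5 give floor(3.5) = 3.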



15. The method for rate control according to claim 14,
wherein the target bit rate is determined based on the
average encoding time interval for the Inter-frame, the
average number of skipped frames due to insufficient
computational resources, the target buffer level, the
available channel bandwidth and actual buffer occupancy.

16. The method for rate control according to claims 8 and
15, wherein the target bit rate is determined using the
following equation:

$$
f(n) = \max\left\{ 0,\ u(t_{n,i}) \cdot \max\{ T_{ave}(n-1),\ T_{fi}(n) \} + (\gamma - 1)\left( B_c(t_{n,i}) - Target(n) \right) \right\}
$$

wherein

• f(n) is the target bit rate,

• t_{n,i} is the time instant at which the n-th Inter-frame
in the i-th GOP is coded, and

• γ is a constant.
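
The target bit rate computation of claim 16 can be sketched as follows, assuming the equation reads f(n) = max{0, u(t_{n,i})·max{T_ave(n-1), T_fi(n)} + (γ - 1)·(B_c(t_{n,i}) - Target(n))}. The function name, argument names and the default value of γ are hypothetical and only serve the illustration.

```python
def target_bit_rate(bandwidth, t_ave_prev, t_fi, buffer_occupancy,
                    target_level, gamma=0.25):
    """Return f(n) under the assumed reading of claim 16.

    bandwidth        : u(t_{n,i}), available channel bandwidth (bits/s)
    t_ave_prev       : T_ave(n-1), average encoding time interval (s)
    t_fi             : T_fi(n), target encoding time interval (s)
    buffer_occupancy : B_c(t_{n,i}), actual buffer occupancy (bits)
    target_level     : Target(n), target buffer level (bits)
    gamma            : constant (hypothetical default)
    """
    # Bits the channel can drain during the longer of the two intervals.
    channel_budget = bandwidth * max(t_ave_prev, t_fi)
    # With gamma < 1 this term reduces the budget when the buffer is above target.
    correction = (gamma - 1.0) * (buffer_occupancy - target_level)
    # The target is never negative.
    return max(0.0, channel_budget + correction)
```

For instance, with a 64 kbit/s channel, T_ave(n-1) = 0.05 s, T_fi(n) = 0.04 s, a buffer holding 10000 bits and a target level of 8000 bits, the sketch yields about 3200 - 0.75·2000 = 1700 bits.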

17. The method for rate control according to claim 16,
wherein the target bit rate is further adjusted by a
weighted temporal smoothing using



/(«)= 



max J 



u(t nt )*mzxfr ave (n -l\ T fii (»)} 



+H lldr (n -l), n x /(„)+ (l- J u)x f(n 



wherein

• f(n) is the smoothed target bit rate,

• μ is a weighting control factor constant, and

• H_hdr(n) is the amount of bits used for shape
information, motion vector and header of the previous
frame.

18. The method for rate control according to claim 1,
further comprising the following steps:

• Determining a sleeping time of each frame after the
frame is coded,

• Determining a starting encoding time of each
frame based on the computed sleeping time,

• Determining a starting decoding time of a next frame
based on the computed starting encoding time, and

• Transmitting the determined starting decoding time
to a decoder which is designed for decoding the
video sequences.

19. The method for rate control according to claim 18, 
wherein the sleeping time is determined according to the 
following formula : 



wherein ST_c(n) is the sleeping time of the coding
process.

20. The method for rate control according to claim 19, 
wherein the starting encoding time is determined 
according to the following formula: 




$$
SCT(n) = T_c(n) + SCT(n-1) + ST_c(n)
$$



wherein SCT(n) is the starting encoding time. 
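
The recursion of claim 20, SCT(n) = T_c(n) + SCT(n-1) + ST_c(n), can be sketched as a running sum over the coded frames. The function name, the list-of-pairs input format and the initial value SCT(0) = 0 are assumptions made for the illustration only.

```python
def start_encoding_times(frames, sct_initial=0.0):
    """Return the list of SCT(n) for a sequence of coded frames.

    frames      : iterable of (t_c, sleep_time) pairs, i.e. the actual
                  encoding time T_c(n) and sleeping time ST_c(n) in seconds
    sct_initial : assumed starting value SCT(0)
    """
    times = []
    sct = sct_initial
    for t_c, sleep_time in frames:
        # SCT(n) = T_c(n) + SCT(n-1) + ST_c(n)
        sct = t_c + sct + sleep_time
        times.append(sct)
    return times
```

For two frames with (T_c, ST_c) of (0.03 s, 0.01 s) and (0.05 s, 0 s), the sketch gives starting encoding times of about 0.04 s and 0.09 s.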
21. The method for rate control according to claim 20,
wherein the starting decoding time is determined 
according to the following formula: 

SDT(n). \ SCT W' F ' I 
F r 

wherein SDT(n) is the starting decoding time.

22. An apparatus for controlling the rate for encoding a
video sequence, wherein the video sequence comprises a
plurality of Group Of Pictures, wherein each Group of
Pictures comprises at least an I-frame and an Inter-frame,
the apparatus comprising a processing unit being
adapted to perform the following steps for the encoding
of each Inter-frame in the Group of Pictures:

• Determining a desired frame rate based on an 
available bandwidth of a channel which is used for 
transmitting the video sequence and on available 
computational resources for the encoding process; 

• Determining a target buffer level based on the 
desired frame rate and the position of the Inter- 
frame with respect to the I-frame; and 

• Determining a target bit rate based on the target 
buffer level and the available channel bandwidth, 
wherein the target bit rate is used for controlling 
the rate for encoding the video sequence. 



23. A video encoding device for controlling the rate for
encoding a video sequence, wherein the video sequence
comprises a plurality of Group Of Pictures, wherein each
Group of Pictures comprises at least an I-frame and an
Inter-frame, the encoding device comprising a processing
unit being adapted to perform the following steps for
the encoding of each Inter-frame in the Group of
Pictures:

• Determining a desired frame rate based on an 
available bandwidth of a channel which is used for 
transmitting the video sequence and on available 
computational resources for the encoding process; 

• Determining a target buffer level based on the 
desired frame rate and the complexity and the 
position of the Inter-frame with respect to the I- 
frame; and 

• Determining a target bit rate based on the target 
buffer level and the available channel bandwidth, 
wherein the target bit rate is used for controlling 
the rate for encoding the video sequence. 